Post Similarity search using vector databases #2261
Merged
iocanel merged 1 commit into quarkusio:main from Mar 27, 2025
ea893ff to
5008cb9
Compare
geoand
reviewed
Mar 19, 2025
b99b1de to
6a4a667
Compare
Contributor
Author
|
@geoand applied feedback. |
gsmet
reviewed
Mar 19, 2025
Member
gsmet
left a comment
Nice article, I spotted a few typos here and there, HTH.
| With LLMs becoming increasingly popular we often see them being used even for tasks that are not directly related to text generation. | ||
| Such case is using LLMs for recommendation systems. In this post we'll see how you can build such a system using https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html[Quarkus Langchain4j] | ||
| but without using LLMs. More specifically we'll create a simple movie similarity search system using a vector database. The role | ||
| of https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html[Quarkus Langchain4j] in this store is to abstract the underlying vector database through the `EmbeddingStore` interface. |
Member
Wasn't sure what you wanted to write, but "store" looked odd here?
Suggested change
| of https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html[Quarkus Langchain4j] in this store is to abstract the underlying vector database through the `EmbeddingStore` interface. | |
| of https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html[Quarkus Langchain4j] in this story is to abstract the underlying vector database through the `EmbeddingStore` interface. |
| but without using LLMs. More specifically we'll create a simple movie similarity search system using a vector database. The role | ||
| of https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html[Quarkus Langchain4j] in this store is to abstract the underlying vector database through the `EmbeddingStore` interface. | ||
|
|
||
| A relevant sample has been recently added to the https://github.com/quarkiverse/quarkus-langchain4j/tree/main/samples/[Quarkus Langchain4j samples]. |
Member
Could you fix LangChain4j case everywhere?
| </dependency> | ||
| ---- | ||
|
|
||
| To be able to use these dependencies without needing to specify versions, the bom can be add imported to the `dependencyManagement` of the project: |
Member
Suggested change
| To be able to use these dependencies without needing to specify versions, the bom can be add imported to the `dependencyManagement` of the project: | |
| To be able to use these dependencies without needing to specify versions, the bom can be added to the `dependencyManagement` of the project: |
| </dependency> | ||
| ---- | ||
|
|
||
| To properly use the in process embedding model we need to configure it in the `application.properties` file. |
Member
Suggested change
| To properly use the in process embedding model we need to configure it in the `application.properties` file. | |
| To properly use the in-process embedding model we need to configure it in the `application.properties` file. |
| ---- | ||
|
|
||
| To properly use the in process embedding model we need to configure it in the `application.properties` file. | ||
| We also need to configure the pgvector dimension an ensure it's aligned with the dimension of the embedding model. |
Member
Suggested change
| We also need to configure the pgvector dimension an ensure it's aligned with the dimension of the embedding model. | |
| We also need to configure the pgvector dimension and ensure it's aligned with the dimension of the embedding model. |
| } | ||
| ---- | ||
|
|
||
| To use the CSV mapper, we'll need to `jackson` csv dataformat: |
Member
Suggested change
| To use the CSV mapper, we'll need to `jackson` csv dataformat: | |
| To use the CSV mapper, we'll need to add Jackson's CSV dataformat dependency: |
|
|
||
| ==== Bringing it all together ==== | ||
| The only thing that's left is to create a REST endpoint that will allow us to search for similar movies. We could also use a simple UI. | ||
| Let's start with the REST endpoint. It's pretty straight forward. We need to methods one for movie searching and one for searching similar movies. |
Member
Suggested change
| Let's start with the REST endpoint. It's pretty straight forward. We need to methods one for movie searching and one for searching similar movies. | |
| Let's start with the REST endpoint. It's pretty straightforward. We need two methods, one for searching movies and one for searching similar movies. |
|
|
||
| The key elements of that page are: | ||
|
|
||
| * movie-box: a text filed for entering the movie title |
Member
Suggested change
| * movie-box: a text filed for entering the movie title | |
| * movie-box: a text field for entering the movie title |
| * movie-poster: an image for displaying the movie poster | ||
| * similar-results: an additional unordered list for displaying the similar movies | ||
|
|
||
| It's important to remember that the `Movie` entity is using `jackson` to map the CSV columns to the entity fields. |
Member
Suggested change
| It's important to remember that the `Movie` entity is using `jackson` to map the CSV columns to the entity fields. | |
| It's important to remember that the `Movie` entity is using Jackson to map the CSV columns to the entity fields. |
| </html> | ||
| ---- | ||
|
|
||
| I won't go into much detail about the hmtl code as it's outside the scope of this post. |
Member
Suggested change
| I won't go into much detail about the hmtl code as it's outside the scope of this post. | |
| I won't go into much detail about the HTML code as it's outside the scope of this post. |
6a4a667 to
137c925
Compare
Contributor
Author
geoand
approved these changes
Mar 26, 2025
This is a post on how to use Quarkus with vector databases to implement a similarity search example.
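
For readers skimming this PR, here is a rough, plain-Java sketch of the idea behind similarity search. It is not the code from the post: the post relies on the `EmbeddingStore` abstraction and pgvector, whereas this sketch compares vectors in memory using cosine similarity, which is assumed here as the metric. The class name `SimilaritySketch`, the movie titles, and the tiny 3-dimensional vectors are all hypothetical. The dimension check mirrors the reviewed point that the store dimension must match the embedding model's dimension.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the post's implementation.
class SimilaritySketch {

    // Cosine similarity: dot(a, b) / (|a| * |b|). Requires equal dimensions.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                    "Dimension mismatch: " + a.length + " vs " + b.length);
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // treat a zero vector as similar to nothing
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the key of the stored vector closest to the query, or null if empty.
    static String mostSimilar(float[] query, Map<String, float[]> stored) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, float[]> entry : stored.entrySet()) {
            double score = cosine(query, entry.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up embeddings; a real embedding model produces hundreds of dimensions.
        Map<String, float[]> movies = new LinkedHashMap<>();
        movies.put("Space Adventure", new float[] {0.9f, 0.1f, 0.0f});
        movies.put("Courtroom Drama", new float[] {0.0f, 0.2f, 0.9f});

        float[] query = {0.8f, 0.2f, 0.1f};
        System.out.println(mostSimilar(query, movies));
    }
}
```

A vector database does the same comparison at scale and with indexing, so the application only has to hand it the query embedding.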